Kafnets: kernel-based non-parametric activation functions for neural networks
Authors

Simone Scardapane, Steven Van Vaerenbergh, Aurelio Uncini

Preprint submitted to Neural Networks, November 27, 2017. arXiv:1707.04035v2 [stat.ML].

Abstract

Neural networks are generally built by interleaving (adaptable) linear layers with (fixed) nonlinear activation functions. To increase their flexibility, several authors have proposed methods for adapting the activation functions themselves, endowing them with varying degrees of flexibility. None of these approaches, however, has gained wide acceptance in practice, and research on this topic remains open. In this paper, we introduce a novel family of flexible activation functions based on an inexpensive kernel expansion at every neuron. Leveraging several properties of kernel-based models, we propose multiple variations for designing and initializing these kernel activation functions (KAFs), including a multidimensional scheme that nonlinearly combines information from different paths in the network. The resulting KAFs can approximate any mapping defined over a subset of the real line, either convex or nonconvex. Furthermore, they are smooth over their entire domain, linear in their parameters, and they can be regularized using any known scheme, including ℓ1 penalties to enforce sparseness. To the best of our knowledge, no other known model satisfies all these properties simultaneously. In addition, we provide a relatively complete overview of alternative techniques for adapting the activation functions, which is currently lacking in the literature. A large set of experiments validates our proposal.
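The abstract describes each KAF as an inexpensive kernel expansion at every neuron that is linear in its trainable parameters. A minimal NumPy sketch of that idea is below; it is not the authors' implementation, and the Gaussian kernel, the dictionary range, the bandwidth heuristic, and the random initialization of the mixing coefficients are illustrative assumptions.

```python
import numpy as np

# Sketch of a kernel activation function (KAF):
#   f(s) = sum_i alpha_i * exp(-gamma * (s - d_i)^2)
# where d is a fixed dictionary of points and alpha holds the
# learnable mixing coefficients (the model is linear in alpha).

def kaf(s, alpha, d, gamma):
    """Apply a Gaussian-kernel KAF elementwise to activations s."""
    # Kernel matrix between each activation and each dictionary point
    K = np.exp(-gamma * (s[..., None] - d[None, :]) ** 2)  # shape (..., D)
    return K @ alpha

# Fixed dictionary: D points uniformly spaced over an assumed range [-2, 2]
D = 20
d = np.linspace(-2.0, 2.0, D)
gamma = 1.0 / (2 * (d[1] - d[0]) ** 2)  # one common bandwidth heuristic

rng = np.random.default_rng(0)
alpha = rng.normal(scale=0.3, size=D)   # learnable mixing coefficients

s = np.array([-1.0, 0.0, 1.0])          # example pre-activations
y = kaf(s, alpha, d, gamma)
```

Because the expansion is linear in `alpha`, doubling the coefficients doubles the output, which is what makes standard regularizers (including ℓ1 penalties on `alpha`) straightforward to apply.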
Similar resources
Siamese Networks for One-Shot Learning Using Kernel-Based Activation Functions
The lack of a large amount of training data has always been the constraining factor in solving a lot of problems in machine learning, making One Shot Learning one of the most intriguing ideas in machine learning. It aims to learn information about object categories from one, or only a few, training examples, and for certain image classification tasks, has successfully been able to get results c...
Complex-valued Neural Networks with Non-parametric Activation Functions
Complex-valued neural networks (CVNNs) are a powerful modeling tool for domains where data can be naturally interpreted in terms of complex numbers. However, several analytical properties of the complex domain (e.g., holomorphicity) make the design of CVNNs a more challenging task than their real counterpart. In this paper, we consider the problem of flexible activation functions (AFs) in the c...
Improving Graph Convolutional Networks with Non-Parametric Activation Functions
Graph neural networks (GNNs) are a class of neural networks that allow efficient inference on data associated with a graph structure, such as citation networks or knowledge graphs. While several variants of GNNs have been proposed, they only consider simple nonlinear activation functions in their layers, such as rectifiers or squashing functions. In this paper, we inve...
Direct Density Ratio Estimation with Convolutional Neural Networks with Application in Outlier Detection
Recently, the ratio of probability density functions was demonstrated to be useful in solving various machine learning tasks such as outlier detection, non-stationarity adaptation, feature selection, and clustering. The key idea of this density ratio approach is that the ratio is directly estimated so that difficult density estimation is avoided. So far, parametric and non-parametric direct den...
Numerical treatment for nonlinear steady flow of a third grade fluid in a porous half space by neural networks optimized
In this paper, steady flow of a third-grade fluid in a porous half space has been considered. This problem is a nonlinear two-point boundary value problem (BVP) on semi-infinite interval. The solution for this problem is given by a numerical method based on the feed-forward artificial neural network model using radial basis activation functions trained with an interior point method. ...
Journal: CoRR
Volume: abs/1707.04035
Pages: -
Publication date: 2017